Stochastic calculus, non-linear filtering, and the internal model principle: implications for articulatory speech recognition

Author

  • Gordon Ramsay
Abstract

A stochastic approach to modelling speech production and perception is discussed, based on Itô calculus. Speech is modelled by a system of non-linear stochastic differential equations evolving on a finite-dimensional state space, representing a partially observed Markov process. The optimal non-linear filtering equations for the model are stated, and shown to exhibit a predictor-corrector structure, which mimics the structure of the original system. This is used to suggest a possible justification for the hypothesis that speakers and listeners make use of an “internal model” in producing and perceiving speech, and leads to a useful statistical framework for articulatory speech recognition.
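To make the predictor-corrector idea concrete, here is a minimal sketch in Python, not the paper's method. It assumes a scalar state, a hypothetical double-well drift, a hypothetical `tanh` observation map standing in for the articulatory-acoustic mapping, and discrete-time noisy observations. The SDE is discretised with the Euler-Maruyama scheme, and the optimal filter is approximated by a bootstrap particle filter rather than solved exactly. All function names and parameter values are illustrative assumptions.

```python
import numpy as np


def drift(x):
    # Hypothetical non-linear drift (double-well); not taken from the paper.
    return x - x**3


def observe(x):
    # Hypothetical non-linear observation map (assumption for illustration).
    return np.tanh(x)


def simulate(n_steps, dt, sigma, obs_std, rng):
    """Euler-Maruyama simulation of dX = drift(X) dt + sigma dW,
    with a noisy observation of observe(X) taken at every step."""
    x = np.empty(n_steps)
    y = np.empty(n_steps)
    state = 0.5
    for k in range(n_steps):
        noise = sigma * np.sqrt(dt) * rng.standard_normal()
        state = state + drift(state) * dt + noise
        x[k] = state
        y[k] = observe(state) + obs_std * rng.standard_normal()
    return x, y


def particle_filter(y, dt, sigma, obs_std, n_particles, rng):
    """Bootstrap particle filter returning the estimated posterior mean
    of the hidden state after each observation."""
    particles = rng.normal(0.5, 1.0, size=n_particles)
    means = np.empty(len(y))
    for k, yk in enumerate(y):
        # Predictor: push every particle through a copy of the state dynamics.
        noise = sigma * np.sqrt(dt) * rng.standard_normal(n_particles)
        particles = particles + drift(particles) * dt + noise
        # Corrector: reweight by the Gaussian likelihood of the observation.
        log_w = -0.5 * ((yk - observe(particles)) / obs_std) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        means[k] = np.sum(w * particles)
        # Resample in proportion to the weights to limit weight degeneracy.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
    return means


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = simulate(n_steps=200, dt=0.01, sigma=0.3, obs_std=0.1, rng=rng)
    est = particle_filter(y, dt=0.01, sigma=0.3, obs_std=0.1,
                          n_particles=500, rng=rng)
    print("RMSE of filtered mean:", np.sqrt(np.mean((est - x) ** 2)))
```

The predictor step reuses the same drift and diffusion as the simulated system, so the filter carries a copy of the model it is tracking. This loosely illustrates the sense in which the filter's structure mimics that of the original system.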

Similar resources

Optimal filtering and smoothing for speech recognition using a stochastic target model

This paper presents a stochastic target model of speech production, where articulator motion in the vocal tract is represented by the state of a Markov-modulated linear dynamical system, driven by a piecewise-deterministic control trajectory, and observed through a non-linear function representing the articulatory-acoustic mapping. Optimal filtering and smoothing algorithms for estimating the hid...

A non-linear filtering approach to stochastic training of the articulatory-acoustic mapping using the EM algorithm

Current techniques for training representations of the articulatory-acoustic mapping from data rely on artificial simulations to provide codebooks of articulatory and acoustic measurements, which are then modelled by simple functional approximations. This paper outlines a stochastic framework for adapting an artificial model to real speech from acoustic measurements alone, using the EM algorithm....

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making man-machine interaction more natural. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

The status of functional phonological information in statistical speech recognition

The choice of speech production models as a basis for Automatic Speech Recognition (ASR) is often taken to have two straightforward implications for the topology of recognition systems. Firstly, it establishes a set of articulatory properties whose elements are the basic linguistic units to be extracted from the signal. Secondly, it predefines an internal classification of these properties whic...

One-model speech recognition and synthesis based on articulatory movement HMMs

One-model speech recognition (SR) and speech synthesis (SS) based on a common articulatory movement model are described herein. The SR engine has an articulatory feature (AF) extractor and an HMM based classifier that models articulatory gestures. Experimental results of a phoneme recognition task show that the AF outperforms MFCC even if the training data are limited to a single speaker. In th...

Publication date: 1998